WORLD: A Vocoder-Based High-Quality Speech Synthesis System for Real-Time Applications
نویسندگان
چکیده
A vocoder-based speech synthesis system, named WORLD, was developed in an effort to improve the sound quality of realtime applications using speech. Speech analysis, manipulation, and synthesis on the basis of vocoders are used in various kinds of speech research. Although several high-quality speech synthesis systems have been developed, real-time processing has been difficult with them because of their high computational costs. This new speech synthesis system has not only sound quality but also quick processing. It consists of three analysis algorithms and one synthesis algorithm proposed in our previous research. The effectiveness of the system was evaluated by comparing its output with against natural speech including consonants. Its processing speed was also compared with those of conventional systems. The results showed that WORLD was superior to the other systems in terms of both sound quality and processing speed. In particular, it was over ten times faster than the conventional systems, and the real time factor (RTF) indicated that it was fast enough for real-time processing. key words: speech analysis, speech synthesis, vocoder, sound quality, realtime processing
منابع مشابه
Using FFI Interpolator and VQ Quantization for Designing of High Quality 1200 BPS Speech Vocoder
Storaging or transmission of speech signals at very low bit rate is a hot area in the field of speech processing. We used stochastic inter-frame interpolators and vector quantization (VQ) as a new method for developing a high quality 1200 BPS speech vocoder. The objective and subjecgtive test results show that performance of the new vocoder is compairable with 4800 BPS standard vocoders (as CELP).
متن کاملDirect Modeling of Frequency Spectra and Waveform Generation Based on Phase Recovery for DNN-Based Speech Synthesis
In statistical parametric speech synthesis (SPSS) systems using the high-quality vocoder, acoustic features such as melcepstrum coefficients and F0 are predicted from linguistic features in order to utilize the vocoder to generate speech waveforms. However, the generated speech waveform generally suffers from quality deterioration such as buzziness caused by utilizing the vocoder. Although seve...
متن کاملTowards flexible speech coding for speech synthesis: an LF + modulated noise vocoder
This paper presents an ARX-LF-based model of speech that is amenable to low-bit-rate quantization and speech modifications directly at the parametric domain. The new model successfully addresses the non-deterministic part of voiced speech by modulating noise with the glottal flow, while unvoiced speech and transients are synthesized by modulating noise with a signal-derived time envelope. The p...
متن کاملUsing FFI Interpolator and VQ Quantization for Designing of High Quality 1200 BPS Speech Vocoder
Storaging or transmission of speech signals at very low bit rate is a hot area in the field of speech processing. We used stochastic inter-frame interpolators and vector quantization (VQ) as a new method for developing a high quality 1200 BPS speech vocoder. The objective and subjecgtive test results show that performance of the new vocoder is compairable with 4800 BPS standard vocoders (as CELP).
متن کاملNon-parametric techniques for pitch-scale and time-scale modification of speech
Time-scale and, to a lesser extent, pitch-scale modifications of speech and audio signals are the subject of major theoretical and practical interest. Applications are numerous, including, to name but a few, text-to-speech synthesis (based on acoustical unit concatenation), transformation of voice characteristics, foreign language learning but also audio monitoring or film/soundtrack post-synch...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- IEICE Transactions
دوره 99-D شماره
صفحات -
تاریخ انتشار 2016